ABSTRACT

We investigate in this paper the performance of a simple 
file sharing principle. For this purpose, we consider a sys- 
tem composed of N peers becoming active at exponential 
random times; the system is initiated with only one server 
offering the desired file and the other peers after becoming 
active try to download it. Once the file has been downloaded 
by a peer, this one immediately becomes a server. To inves- 
tigate the transient behavior of this file sharing system, we 
study the instant when the system shifts from a congested 
state where all servers available are saturated by incoming 
demands to a state where a growing number of servers are 
idle. In spite of its apparent simplicity, this queueing model 
(with a random number of servers) turns out to be quite dif- 
ficult to analyze. A formulation in terms of an urn and ball 
model is proposed and corresponding scaling results are de- 
rived. These asymptotic results are then compared against 
simulations. 

Categories and Subject Descriptors 

C.4 [Computer Systems Organization]: Performance of 
Systems — modeling techniques, performance attributes 

General Terms 

Queueing Systems, Transient Analysis of Markov Processes, 
File Sharing, Peer to Peer 

1. INTRODUCTION 

This paper analyzes the performance of a simple file shar- 
ing principle during a flash crowd scenario when a popular 
content becomes available on a peer-to-peer network. It is 
supposed that a given peer is willing to share a given file 
with a community of N peers, which are initially asleep. An 
asleep peer becomes active at some random time, i.e., it tries 
to download the file from a peer having the complete file. 
Once a peer has downloaded the file, it immediately becomes 
a server from which another peer can download the file. To 
simplify the model, we assume that the file is in one piece 
and not segmented into chunks; the time needed to download
the file from one server is supposed to be random in
order to take into account the diversity of upload capacities 
of peers. 

The goal of this paper is to understand how the network 
builds up in this situation as peers join the system. In par- 
ticular, we are interested in analyzing the growth of the num- 
ber of available servers in the system. Note that there are 
eventually as many servers as peers since each of them can 
complete the file download. 

In spite of its apparent simplicity, the analysis of the system 
is quite difficult because we have to cope with a network 
comprising a random number of servers: When peers com- 
plete their download, they become new servers so that the 
number of servers is continually increasing. It is assumed 
that an incoming peer chooses a server with the smallest 
number of queued peers. Other routing policies are consid- 
ered at the end of this paper. 

The analysis performed in this paper differs substantially
from earlier studies in the technical literature
in that we consider the transient formation of
a network of peers. Yang and de Veciana [17] considered a
similar setting which they analyzed with results related to 
branching processes to describe the exponential growth of 
the number of servers. Our goal in this paper is precisely to 
obtain more detailed asymptotics of this transient regime. 
Except for the paper by Yang and de Veciana [17], most of the
papers published so far on the performance of peer-to-peer
systems assume that peers join and leave the system and
that a steady state regime exists. The problem is then to
evaluate the impact of some parameters of the file sharing
protocol on the equilibrium of the system. Different techniques
can be used to perform such an analysis, for instance
by using a Markov chain to describe the state of the system,
possibly with approximation techniques when the
state space related to the number of peers in the system is
too large. See Ge et al. [7]. A fluid flow analysis with an
underlying Markovian structure is proposed in Clevenot and
Nain [5] in order to model the Squirrel peer-to-peer caching
system. In Qiu and Srikant [14], the authors directly use a
fluid approximation to study the steady state of a peer-to-peer
network, subsequently complemented by diffusion variations
around the steady state solutions. In Massoulie and
Vojnovic [13], the authors study the performance of a file
sharing system via a stochastic coupon replication formulation,
a coupon corresponding to a chunk of a file. The goal
of this study is to understand the impact of the policy applied
by users for choosing coupons on the performance of
the system. The system is studied in equilibrium as in Qiu
and Srikant [14].

The rest of this paper is organized as follows: In Section 2
we describe the system under consideration and present some
heuristics to study it. It turns out that the
dynamics of the system can be decomposed into two regimes.
In the first one, there are almost no empty servers and we
establish an analogy with a random urn and ball problem
on the real line. By approximating the probability of selecting
an urn by its mean value, we analyze in Section 4
the corresponding deterministic urn and ball problem. The
random urn and ball problem is much more
complicated to analyze. The complete analysis is done in
[12] and only the main results are summarized in Section 5.
In Section 6 we support via simulation the different approximations
and heuristics made in this paper to analyze the
file sharing system. Concluding remarks are presented in
Section 7.

2. MODEL DESCRIPTION 
2.1 Problem formulation 

We consider throughout this paper a system composed of 
N peers interested in downloading a given file. At the be- 
ginning, only one peer (the initial server) has the file and 
other peers are asleep. When becoming active, after an ex- 
ponentially distributed duration of time with parameter p, 
a peer tries to download the file from the server that is the
least loaded in terms of number of queued peers. In particular,
the first peer becoming active downloads the file from
the initial server. The time needed to download the file is 
assumed to be exponentially distributed with mean 1. 

Exponential distributions. The hypothesis on the distribution
of the time it takes a peer to become active
is quite reasonable: this is a classical situation when a large
number of independent users may access some network. The
assumption on the duration of the download is not
realistic in practice since this quantity is related to the size
of the requested file, whose distribution is more likely to be
bounded by the maximal size of a chunk. As will be seen,
even within this simplified setting (chosen in order to have a nice
probabilistic description of the process), the mathematical problems
turn out to be quite intricate to solve. In this respect,
our study could be seen as a first step in the analysis of
flash crowd scenarios. Our current investigations
of the general case seem to show that the exponential
distribution does not have a critical impact on the qualitative
behavior as long as the FIFO policy is used by servers.
Mathematically, however, numerous technical points are not
settled in this case.

We assume that peers requesting the file from the same
server are served according to the FIFO discipline. Note
that, because of the exponential distribution assumption,
this case is equivalent to the Processor-Sharing discipline,
i.e., when n peers are present for a duration of time h,
each of them receives the amount of work h/n. Just after
completing the file download, a peer immediately becomes
a server from which other peers can retrieve the file.
The problem of "free riders", i.e., peers who do not become
servers after service completion, is not discussed here. As
will be seen, this feature does not significantly change the
qualitative properties of the system. The problem of servers
who disconnect while they have downloads in progress will
not be discussed in this paper.

It is worth noting that the model under consideration de- 
scribes a "flash crowd" scenario. Indeed, a peer having a file 
accepts to share it with other peers and we are interested in 
the dynamics of the sharing process when a large population 
of peers tries to download the file. Moreover, since the du- 
rations for which these peers stay inactive are independent 
and identically distributed, the flow of arrivals of peers into 
the system is not stationary, but rather accumulates at the 
beginning and is then less and less intense. We are hence 
interested in the transient regime of the system. Contrary
to the earlier studies [7, 13, 14], we are not interested in the
steady state regime of the system, where peers continually 
join and leave the system. 

It is intuitively clear that there should exist two different 
regimes for this system. Initially, it starts congested: many 
peers request the file, and only a few servers are available. 
Afterward, the situation is reversed: there are a large num- 
ber of servers and only a few requests from the remaining 
inactive peers. 

These two regimes clearly appear in Figure 1, depicting the
simulation results with $N = 10^6$ peers and $p = 5/6$. It
shows that before time $T \approx 7$ time units (or equivalently
mean download times), there are almost no empty servers, 
while after that time, more and more servers are empty until 
all peers have completed their download. But as long as 
the input rate is high, a new server immediately receives a 
customer. This is all the more true under the routing policy 
considered, since new peers entering the system choose an 
empty server if any. 
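
The dynamics described above can be reproduced on a small scale. The following Python code is a minimal sketch under stated assumptions, not the simulator used for Figure 1: the function name `simulate_file_sharing`, its return format, and the tie-breaking rule among equally loaded servers (lowest index first) are hypothetical choices made for illustration. It relies only on the memoryless property: each asleep peer wakes up at rate p and each busy server completes a download at rate 1.

```python
import random


def simulate_file_sharing(n_peers, p, seed=0):
    """Event-driven simulation of the file sharing model.

    Returns a list of (time, number_of_servers, number_of_idle_servers),
    one entry for the initial state and one per event.
    """
    rng = random.Random(seed)
    asleep = n_peers      # peers that have not become active yet
    queues = [0]          # queues[s] = peers attached to server s
    busy = 0              # number of servers with a non-empty queue
    t = 0.0
    history = [(t, 1, 1)]
    while asleep > 0 or busy > 0:
        arrival_rate = p * asleep
        total_rate = arrival_rate + busy   # each busy server has rate 1
        t += rng.expovariate(total_rate)
        if rng.random() * total_rate < arrival_rate:
            # A peer wakes up and joins a least loaded server
            # (assumption: ties are broken by lowest index).
            s = min(range(len(queues)), key=queues.__getitem__)
            if queues[s] == 0:
                busy += 1
            queues[s] += 1
            asleep -= 1
        else:
            # A download completes at a uniformly chosen busy server;
            # the served peer immediately becomes a new, empty server.
            s = rng.choice([i for i, q in enumerate(queues) if q > 0])
            queues[s] -= 1
            if queues[s] == 0:
                busy -= 1
            queues.append(0)
        history.append((t, len(queues), len(queues) - busy))
    return history


if __name__ == "__main__":
    for time, servers, idle in simulate_file_sharing(2000, 5 / 6)[::400]:
        print(f"t={time:6.2f}  servers={servers:5d}  idle fraction={idle / servers:.3f}")
```

The fraction of idle servers at each event is the ratio of the last two entries of a record; each event costs a scan over all servers, so the sketch is only practical for a few thousand peers.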




Figure 1: Fraction of idle servers: $N = 10^6$ and $p = 5/6$.

2.2 A Non-Trivial Queueing Model

From the above description, the system can be represented 
by means of a queueing system with a random number of 
queues. Initially, the system is composed of a single server,
and once a customer has completed its service, it becomes
a new server. Since only a finite total number of customers 
is considered, there are eventually as many servers as cus- 
tomers. 

When peer inter-arrival times and file download times are as- 
sumed to be exponentially distributed, a minimal Markovian 
representation of this queueing model requires the knowl- 
edge of the number of peers which are still asleep and the 
number of peers connected to each server. Since this Markov 
process is ultimately absorbing (all peers are servers at the 
end), the transient behavior of the system is of course the 
main object of interest in the analysis. Even in very sim- 
ple queueing systems, the transient behavior is delicate to 
analyze and much more difficult to describe than the stationary
behavior. The classical M/M/1 queue is a good (and
simple) example of such a situation, where transient characteristics
are not easy to express with simple closed form
formulas. See Asmussen [1] for example.

Given the multi-dimensional description (with unbounded 
dimension) of the Markov process, the system considered 
here is much more intricate and challenging. To analyze 
this system, a simpler mathematical model with urns and 
balls is used to investigate the duration of the first regime 
of this system. The specific point addressed in this paper is 
to describe the transient behavior when N becomes large. 

2.3 Modeling the First Regime 

Initially, the input rate is large and therefore a newly created 
server receives very quickly many requests from the numer- 
ous peers becoming active. The first regime described in the 
previous section and illustrated in Figure 1 is hence characterized
by the fact that the periods during which
some servers are idle are negligible. In a second phase the
number of empty servers begins to be significant before increasing
very rapidly in the last phase. This phenomenon is
discussed in Section 5. For the first regime, this leads us to
describe the dynamics of the system as follows. 

Let $S_n$ be the time at which the n-th server is created, with
the convention that $S_0 = 0$ (the initial server has label 0).
During the n-th time interval $(S_{n-1}, S_n)$ for $n \ge 1$, there
are by definition exactly n servers. So if we assume, as argued
above, that empty servers are negligible during the first
regime, $S_n - S_{n-1}$ is well approximated by the minimum of
n independent exponential random variables with parameter
1. The random variable $S_n$ can thus be represented as
$S_{n-1} + E_n/n$, where $E_n$ is an exponential random variable
with parameter 1 independent of the past. In particular,
during the first regime, the following approximation is accurate.



Approximation 1. For $n \in \mathbb{N}$, as long as the system is
still in the first regime, the instant of creation of the n-th
server is given by $S_n \approx T_n$, where

\[ T_n = \sum_{k=1}^{n} \frac{E_k}{k}, \tag{1} \]

the $(E_k)$ being i.i.d. exponential random variables with unit
mean.



Although this approximation seems quite rough (a rigorous
mathematical formulation of the approximation $S_n \approx T_n$
seems difficult to establish), Proposition 1 and the
subsequent discussion provide strong arguments to
support its accuracy. In the definition of the above approximation,
it is essential to determine the duration of the first
regime, in particular to know whether $S_n \approx T_n$ holds or not.

For instance, one could consider as definition for the du- 
ration of the first regime the last time when there are no 
empty servers. This time is unfortunately not a stopping 
time and turns out to be much more difficult to study. In 
Section 6 we shall consider different heuristics for evaluating
the length of the first regime. We start the analysis by
introducing the index $\nu$ defined as follows.

Definition 1. The duration of the first regime is defined
as $S_\nu$, where $\nu$ is the first index $n \ge 1$ such that one or no
peer arrives between $S_{n-1}$ and $S_n$.

According to this definition, the first regime lasts as long as 
between the creation of two successive servers, at least two 
peers arrive in the system. The intuition behind this heuris- 
tic is that, because of the policy for the choice of servers, 
if many peers arrive in any interval, then the least loaded 
servers will receive requests from arriving peers. Thus, as 
long as many peers arrive, it is quite rare for a server to 
remain empty. 

The phase transition should occur when the number of ar- 
rivals between the creation of two successive servers is not 
sufficient to give work to empty servers which are created. 
In particular, if no peers arrive in some interval, then there 
will be at least two empty servers at the beginning of the 
next time interval. So the first time when only a few peers 
arrive in some interval should be a good indication on the 
current state of the system. A probably more natural heuris- 
tic would have been to consider the first interval in which 
no peer arrives. Nevertheless, an argument in favor of the 
former heuristic is that it enjoys the following nice property. 

Proposition 1. For $n \le \nu$, at most two servers are simultaneously
empty in the n-th interval $(S_{n-1}, S_n)$.

Proof. The proof is by induction. For n = 1, the property
is trivial, since there is only one server in the first interval.
Consider now $1 < n \le \nu$, and suppose that the
property holds for n - 1. Since at least two peers arrive in
the (n - 1)-th interval, and since these peers are necessarily
routed to empty servers, if any, there is no empty server
just before $S_{n-1}$. Therefore, just after $S_{n-1}$, there are at
most two empty servers, and so the property holds as long
as $n \le \nu$. □

We are now able to justify Approximation 1. Indeed, for
$n \le \nu$ the number of non idle servers is between n - 2 and
n. For n large, approximately n servers are busy, thus $S_n - S_{n-1}$
is close in distribution to an exponentially distributed
random variable with parameter n. During the first time
intervals, the number of empty servers is negligible. Indeed,
consider any finite index n, then it is easy to see that the
mean number of peers that arrive in the n-th interval is
proportional to N. So after the creation of the n-th server,
the mean time before the next arrival behaves as 1/N, and
so is very small when N is large. This intuitively shows
that the fraction of idle servers is initially negligible, which
justifies Approximation 1.

From now on, the identification of $S_n$ and $T_n$, where the
sequence $(T_n)$ is defined by Equation (1), is assumed to hold.
Results on $T_n$ can be assumed to hold for $S_n$ when $n \le \nu$.

3. URN AND BALL PROBLEM 

Denote by $(E_i^p, 1 \le i \le N)$ an i.i.d. sequence of exponentially
distributed random variables with parameter p. For
$i \le N$, $E_i^p$ is the time at which the i-th peer becomes active.

We introduce the following urn and ball model on the real
line: The interval $(T_{n-1}, T_n)$ is the n-th urn and the variables
$(E_i^p, 1 \le i \le N)$ are the locations of N balls thrown
on the real line. The set $\{T_{n-1} \le E_i^p \le T_n\}$ is simply the
event that the i-th ball falls into the n-th urn. Conditionally
on the sizes of the urns, i.e., on $T = (T_n)$, we have that the
probability of such an event (which does not depend on i) is

\[ P_n = P(T_{n-1} \le E_i^p \le T_n \mid T) = e^{-p T_{n-1}} \left(1 - e^{-p E_n/n}\right), \tag{2} \]

where the random variables $E_n$, $n \ge 1$, are independent and
exponentially distributed with mean unity.

With the above formulation, we have then to deal with the
following urn and ball model:

1. A random probability distribution $\mathcal{P} = (P_n)$ is given
(urns with random sizes).

2. N balls are thrown independently according to the
probability distribution $\mathcal{P}$.

It is worth noting that the above urn and ball model has
an infinite number of urns. In addition, although urn and
ball problems have been widely studied in the literature, our
model presents a remarkable feature: For $i \ge 1$, a ball falls
into urn i with probability $P_i$ which is a random variable,
but conditionally on the sequence $(P_n)$, this is a classical
urn and ball problem. Mathematical results for urn models
with random distributions are quite rare. See Kingman [11]
and Gnedin et al. [8] and the references therein where some
related models have been investigated.

The random model under consideration will give us some
information on the behavior of our system. The following
proposition establishes a simple but important characterization
of the asymptotic behavior of $(P_n)$.

Proposition 2. Let $(E_i)$, $i \ge 1$, be independent exponential
random variables with parameter 1. Then, for $n \in \mathbb{N}$,

\[ T_n = \sum_{k=1}^{n} \frac{E_k}{k} \stackrel{\text{dist.}}{=} \max_{1 \le k \le n} E_k, \tag{3} \]

and the sequence $(T_n - \log n)$ converges almost surely to
a finite random variable $T_\infty$ whose distribution is given by
$P(T_\infty \le x) = \exp(-\exp(-x))$ for $x \in \mathbb{R}$.

The conditional probability $P_n$ of throwing a ball into the
n-th urn can be written as

\[ P_n = \frac{p}{n^{p+1}} X_{n-1} Z_n, \tag{4} \]

where

\[ Z_n = \frac{n}{p} \left(1 - e^{-p E_n/n}\right) \quad\text{and}\quad X_{n-1} = n^p e^{-p T_{n-1}} \]

are independent random variables. As n goes to infinity, $X_n$
(resp. $Z_n$) converges in distribution to $X_\infty$ (resp. $Z_\infty$). The
convergence of $(X_n)$ to $X_\infty$ holds almost surely and in $L_q$,
for any $q \ge 1$.

The limiting variable $Z_\infty$ has an exponential distribution
with parameter 1 and $X_\infty$ has a Weibull distribution with
parameter 1/p,

\[ P(X_\infty \ge x) = e^{-x^{1/p}}, \quad x \ge 0. \tag{5} \]

Proof. Let $E_{(1)} \le E_{(2)} \le \cdots \le E_{(n)}$ be the variables
$(E_k, 1 \le k \le n)$ in increasing order. In particular $E_{(n)} =
\max_{1 \le k \le n} E_k$. With the convention $E_{(0)} = 0$, due to standard
properties of the exponential distribution, the variables
$E_{(i+1)} - E_{(i)}$, $i = 0, \ldots, n-1$, are independent and the variable
$E_{(i+1)} - E_{(i)}$ is the minimum of n - i exponential variables
with parameter 1, i.e., has the same distribution as
$E_{n-i}/(n-i)$. The distribution identity (3) then follows.

Since $Z_n \stackrel{\text{dist.}}{=} (n/p)(1 - \exp(-p E_1/n))$, it converges in distribution
to an exponential distribution with parameter one.

Define

\[ M_n = \sum_{k=1}^{n} \frac{E_k - 1}{k} = T_n - H_n, \]

where $(H_n)$ is the sequence of harmonic numbers, $H_n =
1 + 1/2 + \cdots + 1/n$. The sequence $(M_n)$ is clearly a martingale,
it is bounded in $L_2$ since

\[ E(M_n^2) = \sum_{k=1}^{n} \frac{E\left((E_k - 1)^2\right)}{k^2} \le \sum_{k=1}^{\infty} \frac{1}{k^2} < +\infty. \]

It therefore converges almost surely. See Williams [16] for
example. The almost sure convergence of $(T_n - \log n) =
(M_n + H_n - \log n)$ is thus proved. Identity (3) gives that,
for x > 0,

\[ P(T_n - \log n \le x) = \left(1 - e^{-x - \log n}\right)^n \to e^{-e^{-x}}, \]

as n goes to infinity.

Since

\[ X_n = e^{-p M_n} e^{p(\log(n+1) - H_n)}, \]

one gets the almost sure convergence of $(X_n)$. It is easy to
check that, for q > 0,

\[ E(X_n^q) = (n+1)^{qp} \prod_{i=1}^{n} \frac{1}{1 + qp/i} = (n+1)^{qp} \frac{\Gamma(n)}{\Gamma(n + qp)} \Gamma(qp) \sim \Gamma(qp), \tag{6} \]

when $n \to \infty$, where $\Gamma$ is the usual Gamma function, and
where the last equivalence easily comes from Stirling's Formula.
In particular, for any q > 0, the q-th moment of
$X_n$ is therefore bounded with respect to n. One deduces
the convergence in $L_q$ of the sequence $(X_n)$. Since $X_n =
\exp(-p(T_n - \log(n+1)))$, one has the equality in distribution
$X_\infty = \exp(-p T_\infty)$ which gives the law of $X_\infty$. □
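
Identity (3) and the Gumbel limit of Proposition 2 can be checked empirically. The Python sketch below is an illustration only; the helper names (`sample_T`, `sample_max`, `harmonic`, `gumbel_cdf`, `empirical_cdf_at`) are hypothetical. It uses the fact that both sides of (3) have mean $H_n$, and compares the empirical distribution of $T_n - \log n$ at one point with $\exp(-\exp(-x))$.

```python
import math
import random


def sample_T(n, rng):
    """One sample of T_n = sum_{k=1}^n E_k / k, with E_k i.i.d. Exp(1)."""
    return sum(rng.expovariate(1.0) / k for k in range(1, n + 1))


def sample_max(n, rng):
    """One sample of the maximum of n i.i.d. Exp(1) variables."""
    return max(rng.expovariate(1.0) for _ in range(n))


def harmonic(n):
    """Harmonic number H_n, the common mean of both sides of (3)."""
    return sum(1.0 / k for k in range(1, n + 1))


def gumbel_cdf(x):
    """Limiting distribution function of T_n - log n."""
    return math.exp(-math.exp(-x))


def empirical_cdf_at(samples, x):
    """Fraction of samples that are at most x."""
    return sum(s <= x for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    n, runs = 50, 4000
    t_samples = [sample_T(n, rng) for _ in range(runs)]
    m_samples = [sample_max(n, rng) for _ in range(runs)]
    print("H_n           :", harmonic(n))
    print("mean of T_n   :", sum(t_samples) / runs)
    print("mean of max   :", sum(m_samples) / runs)
    centered = [t - math.log(n) for t in t_samples]
    print("empirical cdf :", empirical_cdf_at(centered, 0.5))
    print("Gumbel cdf    :", gumbel_cdf(0.5))
```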



It is important to note that the probability distribution
$\mathcal{P} = (P_n)$ is a random element in the set of probability distributions
on $\mathbb{N}$. The decay of this distribution follows a power
law with parameter p + 1, because according to the previous
proposition, $n^{p+1} P_n$ converges in distribution to $p X_\infty Z_\infty$.
Using the asymptotic behavior derived in (6) with q = 1, it
is easy to see that the average probability for a ball to fall
into the n-th urn satisfies the following relation

\[ E(P_n) \sim \frac{p\,\Gamma(p)}{n^{p+1}}. \tag{7} \]

This equivalence suggests the introduction of a deterministic 
version of the urn and ball problem considered. 
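
The structure of the random urn probabilities can be made concrete numerically. The following Python sketch is illustrative only, and the names `urn_probabilities` and `factorized` are hypothetical. It draws one realization of the urn boundaries, computes $P_n$ as the probability that an exponential variable with parameter p falls between $T_{n-1}$ and $T_n$, and checks two exact identities: the probabilities telescope to $1 - e^{-p T_n}$, and each $P_n$ equals a factor $p/n^{p+1}$ times the product of $n^p e^{-p T_{n-1}}$ and $(n/p)(1 - e^{-p E_n/n})$.

```python
import math
import random


def urn_probabilities(n_max, p, rng):
    """Draw T_0..T_nmax and return (T, E, P) as lists indexed from 0.

    T[n] is the n-th urn boundary, E[n] the unit-mean exponential used
    to build it, and P[n] the conditional probability of the n-th urn.
    Index 0 holds placeholders (T[0] = 0, E[0] = P[0] = 0).
    """
    T, E = [0.0], [0.0]
    for n in range(1, n_max + 1):
        e = rng.expovariate(1.0)
        E.append(e)
        T.append(T[-1] + e / n)
    P = [0.0]
    for n in range(1, n_max + 1):
        P.append(math.exp(-p * T[n - 1]) * (1.0 - math.exp(-p * E[n] / n)))
    return T, E, P


def factorized(n, p, T, E):
    """P_n rebuilt as p / n^(p+1) times the two scaled factors."""
    x_prev = n ** p * math.exp(-p * T[n - 1])
    z_n = (n / p) * (1.0 - math.exp(-p * E[n] / n))
    return p / n ** (p + 1) * x_prev * z_n


if __name__ == "__main__":
    rng = random.Random(0)
    p, n_max = 5 / 6, 200
    T, E, P = urn_probabilities(n_max, p, rng)
    print("sum of P_1..P_n  :", sum(P[1:]))
    print("1 - exp(-p T_n)  :", 1.0 - math.exp(-p * T[n_max]))
    print("n^(p+1) P_n at n = 200:", n_max ** (p + 1) * P[n_max])
```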

4. DETERMINISTIC PROBLEM 
4.1 Description 

Denote by $\mathcal{Q} = (q_n)$ a probability distribution on $\mathbb{N}$ such
that

\[ \lim_{n \to +\infty} n^{\delta} q_n = \alpha, \tag{8} \]

for some $\alpha > 0$ and $\delta > 1$. For each n, $q_n$ can be seen
as the probability for a ball to fall in the n-th urn. When
$\delta = p + 1$ and $\alpha = p\,\Gamma(p)$, the sequence $(q_n)$ has the same
asymptotic behavior as $E(P_n)$ given by Equation (7). Hence,
this model may be considered as the deterministic equivalent
of the urn and ball problem defined in the previous section.
For the sake of clarity, the problem with the probability
distribution $\mathcal{P}$ (resp. $\mathcal{Q}$) will be referred to as the random
(resp. deterministic) problem.

The deterministic problem amounts to throwing N exponential
variables with parameter p on the half-real line, where
this line has been divided into deterministic intervals $(t_{n-1}, t_n)$
with $t_n = E(T_n)$. The main quantity of interest in the following
is the asymptotic behavior with respect to N of the index
of the first urn that does not receive any ball.



Definition 2. Let us denote by $\eta_i^R(N)$ (resp. $\eta_i^D(N)$) the
number of balls in the i-th urn when N balls have been thrown
in the random (resp. deterministic) urn and ball problem,
and define

\[ \nu^R(N) = \inf\{i \ge 1 : \eta_i^R(N) = 0\}, \qquad \nu^D(N) = \inf\{i \ge 1 : \eta_i^D(N) = 0\}. \tag{9} \]



In view of Definition 1, to investigate the duration of the
first regime of the system, the asymptotic behavior of the
sequences $(\nu^R(N))$ and $(\nu^D(N))$ is analyzed. Since we consider
that the first regime lasts until one or no peers arrive
between the creation of two successive servers, we should
consider $\nu'(N) = \inf\{i \ge 1 : \eta_i(N) \le 1\}$ to be
rigorous. In fact, the mathematical analysis of the index of
the first empty urn can easily be extended to the first urn
that receives at most k balls. For the sake of simplicity, we
therefore only treat the case k = 0. Neither the orders of
magnitude nor the asymptotic behaviors established in the
following are affected by the value of k, in particular if
we consider 1 instead of 0.

To conclude this section, let us give a rough approximation
of the correct order of magnitude for $\nu^R(N)$ and $\nu^D(N)$ as N
gets large. Rigorous mathematical analysis is carried out in
Section 4.2 while Section 6 compares the insights provided
by the two models.



For $i \ge 1$, $E(\eta_i^D(N)) = N q_i \sim \alpha N / i^{p+1}$. Hence, in the
deterministic model, a finite number of balls will fall in the i-th
urn as soon as i is of the order of $N^{1/(p+1)}$ as N becomes
large. Hence we expect that in the deterministic model,
$\nu^D(N)/k(N)$ converges in distribution for $k(N) = N^{1/(p+1)}$.
Theorem 1 below shows that the location of the first empty
urn is in fact slightly smaller than $N^{1/(p+1)}$, i.e., of the order
of $(N/\ln N)^{1/(p+1)}$. Nevertheless this heuristic approach
gives the correct exponent in N.

Although $E(\eta_i^R(N))$ has the same asymptotic behavior, the
corresponding heuristic approach in the case of the random
model is more subtle. Indeed, we have

\[ E(\eta_i^R(N)) = N E(P_i) \sim N p\,\Gamma(p) / i^{p+1}, \]

so the number of balls falling in the i-th urn should be of
the order $N i^{-p-1}$. However, in the random model, the i-th
interval has random length $E_i/i$. So from $T_{i-1}$, the
next point $T_i$ is at a distance $E_i/i$ and the first ball is at
a distance corresponding to the minimum of $N i^{-p-1}$ i.i.d.
exponential random variables with parameter 1. Thus, with
this approximation, the i-th interval is empty with probability

\[ P\left(\frac{E_0}{N i^{-p-1}} \ge \frac{E_1}{i}\right) = \frac{1}{1 + N/i^{p+2}}, \]

where $E_0$ and $E_1$ are independent exponential random variables
with parameter 1.

When $N \to \infty$, this probability is non negligible as soon
as i is of order $N^{1/(p+2)}$, which is significantly below what
we found in the deterministic case. Theorem 2 below shows
that this is indeed the correct answer. The order of magnitude
is one order smaller, compared to the deterministic
case, because of the variability of the interval sizes: to some
extent, a very small interval is generated, so that no balls
fall in it, while in the deterministic case, some balls would
have.
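
The gap between the two models can be observed on a single sample. The Python sketch below is a hedged illustration, not the simulation code of the paper; all function names are hypothetical. It throws the same N balls (exponential with parameter p) into deterministic urns with boundaries $H_n$ (the mean of the random boundaries) and into random urns with boundaries built from unit-mean exponentials divided by their index, and reports the index of the first empty urn in each case, next to the two scales obtained from the heuristics above.

```python
import bisect
import math
import random


def first_empty_urn(boundaries, balls):
    """1-based index of the first urn (b[i-1], b[i]] receiving no ball.

    Returns None if every urn delimited by `boundaries` is hit.
    """
    counts = [0] * (len(boundaries) - 1)
    for ball in balls:
        i = bisect.bisect_left(boundaries, ball)
        if 1 <= i < len(boundaries):   # balls past the last boundary are ignored
            counts[i - 1] += 1
    for index, count in enumerate(counts, start=1):
        if count == 0:
            return index
    return None


def deterministic_boundaries(n_max):
    """Boundaries t_n = H_n, the expected value of the random boundaries."""
    bounds = [0.0]
    for k in range(1, n_max + 1):
        bounds.append(bounds[-1] + 1.0 / k)
    return bounds


def random_boundaries(n_max, rng):
    """Boundaries T_n = sum_{k<=n} E_k / k with E_k i.i.d. Exp(1)."""
    bounds = [0.0]
    for k in range(1, n_max + 1):
        bounds.append(bounds[-1] + rng.expovariate(1.0) / k)
    return bounds


def compare(n_balls, p, n_max, seed=0):
    """First empty urn index in the deterministic and random models."""
    rng = random.Random(seed)
    balls = [rng.expovariate(p) for _ in range(n_balls)]
    nu_d = first_empty_urn(deterministic_boundaries(n_max), balls)
    nu_r = first_empty_urn(random_boundaries(n_max, rng), balls)
    return nu_d, nu_r


if __name__ == "__main__":
    n_balls, p = 20000, 1.0
    nu_d, nu_r = compare(n_balls, p, n_max=600)
    print("first empty urn, deterministic:", nu_d)
    print("first empty urn, random       :", nu_r)
    print("(N / log N)^(1/(p+1)) =", (n_balls / math.log(n_balls)) ** (1 / (p + 1)))
    print("N^(1/(p+2))           =", n_balls ** (1 / (p + 2)))
```

A single run only gives one sample of each index; the scaling statements concern the distribution as N grows, so several seeds and values of N are needed to see the trend.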

4.2 Asymptotic Analysis 

Csaki and Foldes [6] give the asymptotic behavior of the
distribution of $\nu^D$ when N is large. A more complete description
of the locations of the first empty urns (and not
only of the first one) can however be achieved. For this
purpose, the variable $W_N^k$ is defined as the number of empty
urns whose index is at most k when N balls have been
thrown. This random variable is formally defined as

\[ W_N^k = \sum_{i=1}^{k} I_{N,i}, \quad\text{with}\quad I_{N,i} = \mathbf{1}_{\{\eta_i^D(N) = 0\}}. \tag{10} \]

The distribution of $W_N^k$ is analyzed when k is dependent
on N. First, some estimates for the mean value and the
variance of $W_N^k$ are required.



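Proposition 3. Assume that the sequence $(q_i)$ is non-increasing.
For x > 0, if

\[ \kappa_x(N) = \left\lfloor (\alpha\delta)^{1/\delta} \left(\frac{N}{\log N}\right)^{1/\delta} \left(1 + \frac{1+\delta}{\delta}\,\frac{\log\log N}{\log N} + \frac{\log x}{\log N}\right) \right\rfloor, \tag{11} \]

where $\lfloor y \rfloor$ is the integral part of y > 0, then

\[ \lim_{N \to +\infty} E\left(W_N^{\kappa_x(N)}\right) = (\alpha\delta)^{1/\delta}\, x. \tag{12} \]

Proof. For $k, N \in \mathbb{N}$,

\[ E(W_N^k) = \sum_{i=1}^{k} (1 - q_i)^N. \tag{13} \]

For $0 \le x \le 1$,

\[ 0 \le e^{-Nx} - (1-x)^N \le x_N (1 - x_N)^{N-1}, \]

where $x_N$ is the unique solution to the equation $\exp(-Nx) =
(1-x)^{N-1}$, since the function $x \mapsto e^{-Nx} - (1-x)^N$ has a
maximum at point $x_N$. It is easily seen that $N x_N \le 2$ (in
fact $N x_N \to 2$ as $N \to +\infty$), so that for $N \ge 1$

\[ \sup_{0 \le x \le 1} \left| e^{-Nx} - (1-x)^N \right| \le \frac{2}{N}. \tag{14} \]

With this relation, we obtain

\[ \left| E(W_N^k) - \sum_{i=1}^{k} e^{-N q_i} \right| \le \frac{2k}{N}, \]

so that for $k = \kappa_x(N)$ and large N, $(1 - q_i)^N$ can be replaced
with $\exp(-N q_i)$ in the expression of $E(W_N^k)$.

For the sake of simplicity, we assume that $q_i = \alpha/i^{\delta}$, for
$i \ge 1$. The general case of a non-increasing sequence $(q_i)$
follows along the same lines since the crucial relation below
holds true with a convenient function q. One defines $q(x) =
\alpha \min(x^{-\delta}, 1)$ for $x \ge 0$. Then

\[ \int_0^{k} e^{-N q(u)}\, du \le \sum_{i=1}^{k} e^{-N q_i} \le \int_1^{k+1} e^{-N q(u)}\, du. \]

The difference between these two integrals is bounded by
$2\exp(-\alpha N/k^{\delta})$. Now take k = k(N) with k(N) of the
same order of magnitude as $(N/\log N)^{1/\delta}$, say, $k(N) \sim
A (N/\log N)^{1/\delta}$ for some A > 0. We have

\[ E\left(W_N^{k(N)}\right) = \int_1^{k(N)} e^{-N q(u)}\, du + o(1). \]

The right hand side of the above equation is given by

\[ \int_1^{k(N)} e^{-\alpha N/u^{\delta}}\, du = \frac{(\alpha N)^{1/\delta}}{\delta} \int_{\alpha N k(N)^{-\delta}}^{\alpha N} e^{-u} u^{-(\delta+1)/\delta}\, du. \tag{15} \]

Now let $H(N) = \alpha N k(N)^{-\delta}$ and consider

\begin{align*}
e^{H(N)} H(N)^{(1+\delta)/\delta} \int_{H(N)}^{\alpha N} e^{-u} u^{-(\delta+1)/\delta}\, du
 &= \int_{H(N)}^{\alpha N} e^{-(u - H(N))} \left(\frac{H(N)}{u}\right)^{(\delta+1)/\delta} du \\
 &= \int_1^{\alpha N/H(N)} H(N)\, e^{-H(N)(u-1)}\, \frac{du}{u^{(\delta+1)/\delta}} \\
 &= \int_0^{\alpha N/H(N) - 1} H(N)\, e^{-H(N) u}\, \frac{du}{(1+u)^{(\delta+1)/\delta}} \sim 1,
\end{align*}

since $N/H(N) \to +\infty$ and $H(N) \to +\infty$ as $N \to +\infty$.
Therefore, an equivalent expression of the integral in the
right hand side of Equation (15) has been obtained. Gathering
these results, we obtain

\[ E\left(W_N^{k(N)}\right) = \frac{(\alpha N)^{1/\delta}}{\delta}\, e^{-H(N)} H(N)^{-(1+\delta)/\delta} + o(1). \tag{16} \]

Relation (12) is obtained by taking $k(N) = \kappa_x(N)$. □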
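
The quantities appearing in this proof are easy to evaluate numerically. The Python sketch below is an illustration under stated assumptions: it takes the pure power-law case with hypothetical values for the constant (0.5) and the exponent (2), chosen so that the urn probabilities sum to less than one, and all function names are made up for the example. It computes the index defined by (11), the exact mean number of empty urns from (13), its exponential approximation, and the closed-form equivalent obtained at the end of the proof. Since the corrections are logarithmic in N, the values at moderate N may still differ noticeably from the limit in (12).

```python
import math


def kappa(x, n_balls, alpha, delta):
    """Index defined by Equation (11)."""
    log_n = math.log(n_balls)
    scale = (alpha * delta) ** (1 / delta) * (n_balls / log_n) ** (1 / delta)
    correction = 1 + (1 + delta) / delta * math.log(log_n) / log_n + math.log(x) / log_n
    return math.floor(scale * correction)


def mean_empty_exact(k, n_balls, alpha, delta):
    """Sum over i <= k of (1 - q_i)^N with q_i = alpha / i^delta, as in (13)."""
    return sum((1 - alpha / i ** delta) ** n_balls for i in range(1, k + 1))


def mean_empty_exponential(k, n_balls, alpha, delta):
    """Same sum with (1 - q_i)^N replaced by exp(-N q_i)."""
    return sum(math.exp(-n_balls * alpha / i ** delta) for i in range(1, k + 1))


def mean_empty_equivalent(k, n_balls, alpha, delta):
    """Closed-form equivalent in terms of H = alpha N k^(-delta)."""
    h = alpha * n_balls / k ** delta
    return (alpha * n_balls) ** (1 / delta) / delta * math.exp(-h) * h ** (-(1 + delta) / delta)


if __name__ == "__main__":
    alpha, delta, x = 0.5, 2.0, 1.0   # illustrative values only
    for n_balls in (10 ** 4, 10 ** 6, 10 ** 8):
        k = kappa(x, n_balls, alpha, delta)
        print(n_balls, k,
              mean_empty_exact(k, n_balls, alpha, delta),
              mean_empty_exponential(k, n_balls, alpha, delta),
              mean_empty_equivalent(k, n_balls, alpha, delta),
              (alpha * delta) ** (1 / delta) * x)
```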



The following proposition shows the equivalence of the variance
and the mean value of $W_N^{\kappa_x(N)}$ under a convenient scaling.
This result is crucial to prove the limit theorems of this
section.

Proposition 4. Assume that the sequence $(q_i)$ is non-increasing.
For x > 0, let $\kappa_x$ be defined by Equation (11),
then

\[ \lim_{N \to +\infty} \mathrm{Var}\left(W_N^{\kappa_x(N)}\right) \Big/ E\left(W_N^{\kappa_x(N)}\right) = 1. \tag{17} \]



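Proof. For $k \ge 1$, by using Equation (13) (which does
not depend on $\alpha$),

\[ \left(E[W_N^k]\right)^2 = \sum_{1 \le i,j \le k} (1 - q_i)^N (1 - q_j)^N \]

and

\[ E[(W_N^k)^2] = E[W_N^k] + \sum_{1 \le i \ne j \le k} (1 - q_i - q_j)^N, \]

so that, to prove the equivalence of $\mathrm{Var}(W_N^k)$ and $E(W_N^k)$,
it is sufficient to show that the quantities

\[ \sum_{1 \le i,j \le k} \left[ (1 - q_i - q_j + q_i q_j)^N - (1 - q_i - q_j)^N \right] \quad\text{and}\quad \sum_{i=1}^{k} (1 - 2 q_i)^N \]

are negligible with respect to $E(W_N^k)$. Since we consider
$k(N) = \kappa_x(N)$, this amounts to showing that these quantities
are o(1) by Proposition 3. The second term is the expected
number of empty urns for the distribution $(\tilde{q}_i)$ such that
$\tilde{q}_i \sim 2\alpha/i^{\delta}$. Estimate (16) shows that

\[ \sum_{i=1}^{\kappa_x(N)} (1 - 2 q_i)^N \sim \frac{\kappa_x(N)^{1+\delta}}{2\alpha\delta N} \exp\left[-2\alpha N \kappa_x(N)^{-\delta}\right], \]

which is o(1). By using the fact that for $a \ge b \ge 0$, $a^N - b^N \le N(a - b)a^{N-1}$,
the first term satisfies

\begin{align*}
\sum_{1 \le i,j \le k} \left[ (1 - q_i - q_j + q_i q_j)^N - (1 - q_i - q_j)^N \right]
 &\le N \sum_{1 \le i,j \le k} q_i q_j (1 - q_i - q_j + q_i q_j)^{N-1} \\
 &= \frac{1}{N} \left( \sum_{i=1}^{k} N q_i (1 - q_i)^{N-1} \right)^2. \tag{18}
\end{align*}

By using a similar method as in the proof of Proposition 3
we obtain the equivalence

\[ \sum_{i=1}^{\kappa_x(N)} N q_i (1 - q_i)^{N-1} \sim \int_1^{\kappa_x(N)} N q(u)\, e^{-N q(u)}\, du \sim \frac{(\alpha\delta)^{1/\delta}}{\delta}\, x \log N. \]

This equivalence together with Equation (18) completes the
proof of the proposition. □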



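Theorem 1. Let $(q_n)$ be a non-increasing sequence satisfying
Relation (8). For x > 0 and $N \in \mathbb{N}$, set

\[ \kappa_x(N) = \left\lfloor (\alpha\delta)^{1/\delta} \left(\frac{N}{\log N}\right)^{1/\delta} \left(1 + \frac{1+\delta}{\delta}\,\frac{\log\log N}{\log N} + \frac{\log x}{\log N}\right) \right\rfloor. \]

When N goes to infinity, the variable $W_N^{\kappa_x(N)}$ converges in
distribution to a Poisson random variable with parameter
$(\alpha\delta)^{1/\delta} x$.

The index $\nu^D(N)$ of the first empty urn defined by Equation (9)
is such that the variable

\[ \frac{(\log N)^{(1+\delta)/\delta}}{(\alpha\delta N)^{1/\delta}}\, \nu^D(N) - \log N - \frac{1+\delta}{\delta} \log\log N \tag{19} \]

converges in distribution to a random variable Y defined by

\[ P(Y \ge x) = \exp\left(-(\alpha\delta)^{1/\delta} e^{x}\right). \]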



Proof. The Chen-Stein method is the basic tool in the proof
of the theorem. See Barbour et al. [4] for a detailed presentation
of this powerful method. Let N and k be in $\mathbb{N}$ and
$1 \le i_0 \le k$. The variable $W_N^k$ conditioned on the event
$\{I_{N,i_0} = 1\}$ has the same distribution as the number of
empty urns when the balls in the $i_0$-th urn are thrown again
until the $i_0$-th urn is empty. It follows that the number of
balls in any other urn is larger than in the case when they
are assigned at first draw. One deduces that for $i \ne i_0$,

\[ P(I_{N,i} = 1 \mid I_{N,i_0} = 1) \le P(I_{N,i} = 1). \]

The variables $(I_{N,i}, 1 \le i \le k)$ are therefore negatively correlated,
see Barbour et al. [4]. Then, by [4, Corollary 2.C.2],
the following relation holds,

\[ \sum_{m \ge 0} \left| P(W_N^k = m) - \frac{E(W_N^k)^m}{m!}\, e^{-E(W_N^k)} \right| \le 1 - \mathrm{Var}(W_N^k)/E(W_N^k). \]

By taking $k = \kappa_x(N)$ and by using Propositions 3 and 4 we
obtain the convergence in distribution of $W_N^{\kappa_x(N)}$ to a Poisson
distribution with parameter $(\alpha\delta)^{1/\delta} x$. The last part of
the theorem is a simple consequence of the identity
$P(W_N^k = 0) = P(\nu^D(N) > k)$. □

The convergence in distribution of $\nu^D(N)$ has been proved
by Csaki and Foldes [6] with a different method. Our result
gives a more accurate description of the location of empty
urns (and not only the first one) near the index $k_x(N)$.
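The location of the first empty urn can be explored by direct simulation. The sketch below is only illustrative: it assumes the specific sequence $q_i = \alpha i^{-\delta}$ (with $\alpha = 0.83 \approx 1/\zeta(3)$ for $\delta = 3$, so that the $q_i$ sum to roughly one), lumps all urns beyond a cut-off into a single bin, and compares the result with the leading-order scale $(\alpha\delta N/\log N)^{1/\delta}$ read off from Expression (19). The function names `first_empty_urn` and `leading_order_index` are hypothetical.

```python
import math
import random


def first_empty_urn(n_balls, alpha, delta, n_urns, rng):
    """Throw n_balls into urns 1..n_urns with q_i = alpha * i**(-delta)
    (an assumed form of the sequence); the remaining probability mass is
    lumped into one extra bin. Returns the index of the first empty urn,
    or None if urns 1..n_urns are all occupied."""
    weights = [alpha * i ** (-delta) for i in range(1, n_urns + 1)]
    weights.append(max(0.0, 1.0 - sum(weights)))  # all urns beyond n_urns
    draws = rng.choices(range(1, n_urns + 2), weights=weights, k=n_balls)
    occupied = set(draws)
    for i in range(1, n_urns + 1):
        if i not in occupied:
            return i
    return None


def leading_order_index(n_balls, alpha, delta):
    """Leading-order scale (alpha * delta * N / log N)**(1/delta)."""
    return (alpha * delta * n_balls / math.log(n_balls)) ** (1.0 / delta)


if __name__ == "__main__":
    rng = random.Random(2024)
    alpha, delta, n_balls = 0.83, 3.0, 100_000
    samples = [first_empty_urn(n_balls, alpha, delta, 200, rng) for _ in range(20)]
    print("simulated first empty urn:", samples)
    print("leading-order scale      :", leading_order_index(n_balls, alpha, delta))
```

For moderate $N$ the simulated indices are expected to sit somewhat above the leading-order scale, since the $\log\log N / \log N$ correction in $k_x(N)$ is far from negligible at these sizes.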

The following corollary is a straightforward application of 
the detailed asymptotics obtained in the above theorem. 

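Corollary 1 (Cutoff phenomenon). Under the assumption of Theorem 1, if

$$k(N) = (N/\log N)^{1/\delta},$$

then, as $N$ goes to infinity, the following convergence in distribution holds: for $\beta > 0$,

$$W_N^{\beta k(N)} \longrightarrow \begin{cases} +\infty & \text{if } \beta > (\alpha\delta)^{1/\delta}, \\ 0 & \text{if } \beta < (\alpha\delta)^{1/\delta}. \end{cases}$$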
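The cutoff can already be seen on the mean number of empty urns, which is $E(W_N^k) = \sum_{i \le k} (1 - q_i)^N$. The following sketch evaluates this sum on both sides of the threshold $(\alpha\delta)^{1/\delta}$. It assumes the illustrative sequence $q_i = \alpha i^{-\delta}$ with $\alpha = 0.5$ and $\delta = 3$; these values and the function names are assumptions made for the example, not taken from the paper.

```python
import math


def expected_empty_urns(n_balls: int, k: int, alpha: float, delta: float) -> float:
    """Mean number of empty urns among the first k, when each of the
    n_balls falls in urn i with probability q_i = alpha * i**(-delta)
    (an assumed form of the sequence)."""
    total = 0.0
    for i in range(1, k + 1):
        q_i = alpha * i ** (-delta)
        # log1p keeps (1 - q_i)**n_balls accurate when q_i is tiny
        total += math.exp(n_balls * math.log1p(-q_i))
    return total


def cutoff_scale(n_balls: int, delta: float) -> float:
    """k(N) = (N / log N)**(1/delta), as in Corollary 1."""
    return (n_balls / math.log(n_balls)) ** (1.0 / delta)


if __name__ == "__main__":
    alpha, delta = 0.5, 3.0
    threshold = (alpha * delta) ** (1.0 / delta)
    for n_balls in (10**6, 10**9, 10**12):
        k_n = cutoff_scale(n_balls, delta)
        below = expected_empty_urns(n_balls, int(0.8 * threshold * k_n), alpha, delta)
        above = expected_empty_urns(n_balls, int(2.0 * threshold * k_n), alpha, delta)
        print(n_balls, below, above)
```

Below the threshold the mean is tiny, above it the mean grows with $N$; the growth is only polynomial with a small exponent, so large values of $N$ are needed to see a sharp transition.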



So far, only indexes of empty urns have been considered.
The result below shows that the first empty urn appears at a
time of the order of $\log N$. Recalling the approximation
of the peer-to-peer system, this suggests that the time at which
the system begins to serve incoming peers quickly should be
of the same order.



Corollary 2 (First Empty Urn). Let

$$T^D(N) = T_{\nu^D(N)} = \sum_{k=1}^{\nu^D(N)} \frac{E_k}{k}. \qquad (20)$$

Under the assumptions of Theorem 1, the quantity

$$\delta T^D(N) - \log N + \log\log N - \log(\alpha\delta)$$

converges in distribution to $\delta T_\infty$, where $T_\infty$ is the almost sure limit of $E_1 + E_2/2 + \cdots + E_n/n - \log n$ as $n$ goes to infinity.



Proof. If $V_N$ is the variable defined by Expression (19), then

$$\delta \log \nu^D(N) - \log N + \log\log N - \log(\alpha\delta) = \delta \log\left( \frac{1}{\log N} \left( V_N + \log N + \frac{1+\delta}{\delta} \log\log N \right) \right).$$

Since, by Theorem 1, the sequence $(V_N)$ converges in distribution, the right-hand side of the above expression converges in distribution to 0. Proposition 5 shows that

$$E_1 + \frac{E_2}{2} + \cdots + \frac{E_n}{n} - \log n$$

converges almost surely to $T_\infty$. □
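The logarithmic growth of the sum $E_1 + E_2/2 + \cdots + E_n/n$ used in this proof is easy to check numerically. The sketch below, with hypothetical function names, samples the sum for i.i.d. unit-mean exponential variables and compares the empirical mean of the centered sum with its exact value $H_n - \log n$, where $H_n$ is the $n$-th harmonic number; this exact mean tends to Euler's constant (about 0.5772).

```python
import math
import random


def time_to_index(n: int, rng: random.Random) -> float:
    """One sample of E_1 + E_2/2 + ... + E_n/n for i.i.d. unit-mean
    exponential variables E_k."""
    return sum(rng.expovariate(1.0) / k for k in range(1, n + 1))


def centered_mean(n: int) -> float:
    """Exact mean of the centered sum: H_n - log n."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    samples = [time_to_index(n, rng) - math.log(n) for _ in range(200)]
    print("empirical mean of centered sum:", sum(samples) / len(samples))
    print("exact mean H_n - log n        :", centered_mean(n))
```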

5. RANDOM PROBLEM 

For the random model, the probability P n of selecting the 
n-th urn is given by Equation @ of Proposition [5] In the 
(almost sure) limit as $n$ goes to infinity, $X_n \sim X_\infty$ and,
in distribution, $Z_n$ is asymptotically an exponentially distributed
random variable with parameter 1. The sequence
$(P_n)$ can be approximated by

$$\left( \frac{p X_\infty E_n}{n^{p+1}} \right),$$

where $(E_n)$ are i.i.d. exponential variables with unit mean.

In spite of the fact that the decay of $P_n$ follows a power law,
the random factor plays an important role. This factor is
composed of two variables: one (namely $X_\infty$) is fixed once
and for all, and the other (namely $Z_n$) changes for every urn.
The fact that $Z_n$, related to the "width" of the $n$-th urn, can
be arbitrarily small with positive probability suggests that
the index $\nu^R$ of the first empty urn should be smaller than
the corresponding quantity for the deterministic case. This
is indeed true, but the situation in this case is much more
complex to analyze. The complete analysis of the random
case is given in [12], and only sketches of proofs are given
for Proposition 5 and Theorem 2 in the present paper. Note
that a similar problem where $X_\infty$ and the
sequence $(E_n, n \ge 1)$ are independent is fairly easy to solve.
Here, however, these random variables are dependent, and
this dependency requires quite technical probabilistic tools.

To derive asymptotic results for $\nu^R$, as in the previous section, we investigate the asymptotic behavior of the random variable $W_N^k$ defined by

$$W_N^k = \sum_{i=1}^{k} I_{N,i},$$

where $I_{N,i}$ is the indicator of the event that the $i$-th urn is empty once the $N$ balls have been thrown. Although in the deterministic case the Chen-Stein method makes it possible to reduce the analysis of $W_N^k$ to its first and second moments, this is no longer the case for the random problem. Indeed, because of the variability of the urn sizes, the random variables $(I_{N,i}, 1 \le i \le k)$ are no longer negatively correlated. Moreover, the ratio of the expected value to the variance of $W_N^{k(N)}$ does not converge to 1 for a convenient sequence $(k(N))$ as in the deterministic case (Proposition 4), which suggests that if a limit in distribution exists, it cannot be Poisson.

As was pointed out in Hwang and Janson [10], the sequence
$(NP_i, 1 \le i \le k)$ plays a central role in the limiting behavior
of $(W_N^k)$. The following technical proposition gives a result
on the asymptotic behavior of this sequence. It is important
since it introduces the scale $N^{1/(p+2)}$, which turns out to be
the correct scaling for the variable $\nu^R(N)$; see [12] for the
proof.



Proposition 5. Let $x > 0$. When $N$ goes to infinity,
the random sequence $(NP_i, 1 \le i \le xN^{1/(p+2)})$ converges
in distribution to a doubly stochastic Poisson process with
random intensity $x^{p+2} (X_\infty p(p+2))^{-1}$.

Proof. Because of the technicality involved, we only give 
a sketch of the proof. The reader is referred to [12] for more 
details. 

To prove the convergence of the sequence of point processes $M_N = \sum_{i=1}^{k(N)} \delta_{\{NP_i\}}$ with $k(N) = xN^{1/(p+2)}$, it is enough to show the convergence of the Laplace transforms of these point processes applied to some suitable functions. Non-negative continuous functions with compact support would be enough to prove the result, but the next theorem requires a slightly stronger result, namely the convergence of Laplace transforms for non-negative continuous functions vanishing at infinity, i.e., that for any continuous function $f \ge 0$ vanishing at infinity, we have

$$\lim_{N \to +\infty} E\left( e^{-M_N(f)} \right) = E\left( e^{-M_\infty(f)} \right),$$

where, conditionally on $X_\infty$, $M_\infty$ is a Poisson process with intensity $x^{p+2} X_\infty^{-1}/(p(p+2))$.

The general idea is to condition on the random variable $X_\infty$. However, for each $n \ge 1$, $X_\infty$ and $Z_n$ are dependent, so that this cannot be done directly. Instead, the first step of the proof is to show that only the last terms of the point process matter, i.e., that $E(e^{-M_N(f)})$ and $E(e^{-\sum_{i=\beta(N)}^{k(N)} f(NP_i)})$ have the same limit, for any sequence $\beta(N) \ll k(N)$. So we are left with large indexes $i > \beta(N)$, for which the approximation $P_i = p i^{-p-1} X_i Z_i \approx p i^{-p-1} X_{\beta(N)} Z_i$ can be justified.
The main tool behind this approximation is Doob's Inequal- 
ity applied to the reversed martingale 

fc>n 

Thanks to this approximation, it is now perfectly rigorous to condition on $\mathcal{F}_N = \sigma(E_k, k \le \beta(N))$: since for $i > \beta(N)$, $Z_i$ is independent of $X_{\beta(N)}$, we are exactly left with proving the result for the sequence of point processes $\sum_{i=\beta(N)}^{k(N)} \delta_{\{N x_N i^{-p-1} Z_i\}}$ with any converging sequence $x_N \to x_\infty$ ($x_N$ has to be thought of as being equal to $pX_{\beta(N)}$). If $f$ has compact support, it is possible to conclude by applying a result from Grigelionis [5] to show that this sequence of point processes converges to a Poisson process with intensity $x^{p+2}/(x_\infty(p+2))$. In the general case, the convergence is shown by direct computations, by controlling the speed at which $Z_i$ converges in law to an exponential random variable. □
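The value of the limiting intensity can be checked on a simplified version of the last step of this sketch of proof. The code below makes two explicit simplifications that are assumptions of the example: the random factor $pX_{\beta(N)}$ is frozen at a constant `c`, and the $Z_i$ are taken to be exactly unit-mean exponential. Under these assumptions, the mean number of indices $i \le xN^{1/(p+2)}$ with $N c\, i^{-p-1} Z_i \le t$ is $\sum_i (1 - e^{-t i^{p+1}/(Nc)})$, which should approach $t\,x^{p+2}/(c(p+2))$. The function names are hypothetical.

```python
import math


def expected_points(n: int, x: float, t: float, p: float, c: float) -> float:
    """Mean number of indices i <= x * n**(1/(p+2)) such that
    n * c * i**(-p-1) * Z_i <= t, for Z_i unit-mean exponential."""
    # The small offset guards against floating-point rounding of exact powers.
    k = int(x * n ** (1.0 / (p + 2)) + 1e-9)
    return sum(-math.expm1(-t * i ** (p + 1) / (n * c)) for i in range(1, k + 1))


def limiting_mean(x: float, t: float, p: float, c: float) -> float:
    """t times the limiting intensity x**(p+2) / (c * (p+2))."""
    return t * x ** (p + 2) / (c * (p + 2))


if __name__ == "__main__":
    x, t, p, c = 1.0, 1.0, 2.0, 1.0
    for n in (10**4, 10**8, 10**12):
        print(n, expected_points(n, x, t, p, c), limiting_mean(x, t, p, c))
```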



This result, together with standard poissonization techniques,
makes it possible to prove the following theorem, which is the
main result of this section.

Theorem 2. Let $k(N) = N^{1/(p+2)}$. For $x > 0$, $W_N^{xk(N)}$
converges in distribution to a Poisson random variable with
random parameter $x^{p+2} (X_\infty p(p+2))^{-1}$ when $N \to \infty$.



Proof. Again, only a sketch of the proof is given. The first step of the proof is to show the result for the random variable $W_{V_N}^{xk(N)}$, where $V_N$ is a Poisson random variable with parameter $N$, independent of everything else so far. The idea is that the law of $W_N^{xk(N)}$ is not sensitive to the fluctuations of $V_N$ around its mean value, equal to $N$, so that the laws of $W_{V_N}^{xk(N)}$ and of $W_N^{xk(N)}$ have the same asymptotic behavior.

To show the convergence of $W_{V_N}^{xk(N)}$, we consider its generating function: for $u > 0$ and $k \in \mathbb{N}$, we can compute

$$E\left( u^{W_{V_N}^{k}} \right) = E\left( \exp\left( \sum_{i=1}^{k} \log\left( 1 - (1-u) e^{-NP_i} \right) \right) \right) = E\left( e^{-M_{N,k}(f_u)} \right),$$

where $M_{N,k} = \sum_{i=1}^{k} \delta_{\{NP_i\}}$, and $f_u(x) = -\log(1 - (1-u)e^{-x})$ for $x \ge 0$. Then $\int_0^{\infty} (1 - e^{-f_u}) = 1 - u$, so that we conclude with the previous proposition that $W_{V_N}^{xk(N)}$ converges to a random variable which, conditionally on $X_\infty$, is a Poisson random variable with parameter $x^{p+2} (X_\infty p(p+2))^{-1}$. The fact that $W_{V_N}^{xk(N)}$ and $W_N^{xk(N)}$ have the same asymptotic behavior (in law) then follows by standard arguments. □
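The computation of the generating function in this sketch can be spelled out as follows. This is a worked version under the usual poissonization property, assumed here rather than quoted from the paper: conditionally on $(P_i)$, when $V_N$ balls are thrown, the numbers of balls in the urns are independent Poisson variables with means $NP_i$, so that urn $i$ is empty with probability $e^{-NP_i}$. The notation $\tilde{I}_i$ for the indicator that urn $i$ is empty in the poissonized model is introduced only for this illustration.

```latex
\begin{align*}
E\left( u^{\tilde{I}_i} \,\middle|\, (P_j) \right)
  &= \left( 1 - e^{-NP_i} \right) + u\, e^{-NP_i}
   = 1 - (1-u)\, e^{-NP_i}, \\
E\left( u^{W_{V_N}^{k}} \,\middle|\, (P_j) \right)
  &= \prod_{i=1}^{k} \left( 1 - (1-u)\, e^{-NP_i} \right)
   = \exp\left( -\sum_{i=1}^{k} f_u(NP_i) \right)
   = e^{-M_{N,k}(f_u)}, \\
\int_0^{\infty} \left( 1 - e^{-f_u(y)} \right) dy
  &= \int_0^{\infty} (1-u)\, e^{-y} \, dy = 1 - u .
\end{align*}
```

For a Poisson process $M$ on $[0, +\infty)$ with constant intensity $\lambda$, the Laplace functional is $E(e^{-M(f)}) = \exp(-\lambda \int_0^\infty (1 - e^{-f(y)})\,dy)$, so the last identity gives $E(e^{-M(f_u)}) = e^{-\lambda(1-u)}$, which is the generating function of a Poisson variable with parameter $\lambda$.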

This theorem readily yields the following corollary. 

Corollary 3. The random variable $\nu^R(N)/k(N)$ converges in distribution to a random variable $Y$ such that

$$P(Y > x) = E\left( e^{-x^{p+2} X_\infty^{-1}/(p(p+2))} \right).$$

Finally, if $T^R(N) \stackrel{\text{def.}}{=} T_{\nu^R(N)}$ then, for the convergence in distribution,

$$\lim_{N \to +\infty} \frac{T^R(N)}{\log(N)} = \frac{1}{p+2}. \qquad (21)$$

The fact that the parameter of the limiting Poisson law is
random has important effects, especially concerning the expectation.
Indeed, it stems from Equation (5) and Proposition 2
that $\lim E(W_N^{xk(N)})$ is proportional to $E(X_\infty^{-1})$, and
$E(X_\infty^{-1}) < +\infty$ if and only if $p < 1$. Note in particular that
the value $p = 1$ plays a special role for our system.

For $p > 1$, the mean value of $W_N^{xk(N)}$ diverges because it
happens that a finite number of intervals (actually, the $\lfloor p \rfloor$
first intervals) capture most of the balls. This event happens
with an increasingly small probability, so that in the limit
as $N$ goes to infinity, it does not have any impact on our
system. However, for a fixed $N$, this event happens with
a fixed probability as well. For instance, we commonly observed
in various simulations for $p = 2$ and $N = 10000$ that
more than 95% of the peers go to the first server, which is
clearly an undesirable behavior of the system.

6. DISCUSSION 

In this section, a set of simulations of the file sharing principle
is presented to test the different approximations made
in this paper in terms of urn and ball models. These simulations
are in particular used to justify Approximation 1
as well as to compare the insights into the dynamics of the
system provided by the two urn and ball models studied in
this paper. Moreover, another server selection policy is considered,
namely when an incoming peer chooses the server
at random.

Throughout this section, we discuss the relevance of several
random variables. The goal is to assess the accuracy of
the procedure consisting of estimating the length of the first
regime by using the random variable $\nu$ specified in Definition 1.
For this purpose, we define different times:

1. $T_1$ is the first time when two servers are created and
fewer than 2 peers have arrived.

2. $T_2$ is the last time when there is an empty server.

3. $T_3$ is the first time when the input rate is smaller than
the output rate (see Section 6.3).

4. $T_4$ is the first time when a server becomes empty, i.e.,
when a peer leaves a server where it was alone.

We consider the corresponding quantities $\nu_i$: for $i = 1, 2, 3, 4$,
$\nu_i$ is the index $n$ of the interval $(S_{n-1}, S_n)$ in which the event
corresponding to $T_i$ happens. In particular, $\nu_1$ corresponds
to Definition 1. In every simulation, the averages of the
quantities $\nu_i$ and $T_i$ are calculated for the value $p = 2$ over
$10^4$ iterations of the system, which proved to be sufficient in
terms of numerical stability. The number of peers $N$ ranges
up to $5 \cdot 10^7$.

6.1 Validation of Approximation 1

Definition 1 specifies the variable considered throughout this
paper to determine the duration of the first regime of the file
sharing system. This variable was chosen for two reasons.
First, it is a good indicator of the current equilibrium of
the system: the output rate begins to be comparable with
the input rate when only a few peers arrive between the
creation of two successive servers. Second, the stopping
time defined in this way is mathematically tractable when
transposed into the context of a certain urn and ball model.
Compared to [17], it is interesting to note that we are actually
able to prove results rigorously, and not only rely
on simulation. As a byproduct, the mathematical problems
arising in this context are interesting in themselves.

For the sake of completeness, several points need to be addressed.
First, for how long is Approximation 1 valid? Since
the random variable $\nu$ specified in Definition 1 corresponds
to $\nu_1$, we argued earlier that this approximation holds
until $T_1$. This is the main assumption that makes it possible
to cast our problem in terms of urns and balls, and to derive
precise results on $\nu_1$ and $T_1$.

In order to validate the results of Section 5, we check that
$E(\nu_1)$ and $E(T_1)$ behave as predicted by Theorem 2. From
this theorem, we expect to have $E(\nu_1) \sim A_1 N^{1/(p+2)}$ for
some constant $A_1$, and $E(T_1) \approx \log(N)/(p+2)$. Figure 2
shows the graphs of $\log(E(\nu_1))$ and $E(T_1)$ versus $\log(N)$: the
straight lines obtained show good agreement with the theory.
Moreover, via a fitting procedure, one can compute the
slopes of these lines: the results are summarized in Table 1.
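The fitting procedure is not specified in the text; a minimal sketch, assuming an ordinary least-squares fit against $\log N$, is given below. The data used are synthetic curves that follow the predicted laws exactly (they are not the simulation output behind Table 1), and the constants 3.0 and 0.4 as well as the function name are arbitrary choices for illustration.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    p = 2
    sizes = [10**k for k in range(3, 8)]
    log_n = [math.log(n) for n in sizes]
    # Synthetic curves following the predicted laws (not simulation output).
    log_nu1 = [math.log(3.0 * n ** (1.0 / (p + 2))) for n in sizes]
    t1 = [math.log(n) / (p + 2) + 0.4 for n in sizes]
    print("slope of log E(nu_1) vs log N:", least_squares_slope(log_n, log_nu1))
    print("slope of E(T_1) vs log N     :", least_squares_slope(log_n, t1))
```

Both fitted slopes equal $1/(p+2) = 0.25$ on these synthetic curves, which is the value against which the simulated slopes are compared.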




Figure 2: $\log(E(\nu_1))$ (solid) and $E(T_1)$ (dashed), $p = 2$.

Figure 3: The times $E(T_1)$, $E(T_2)$ and $E(T^D)$ when $p = 2$.



The values of interest in Table 1 are in the row labeled "Min":
we see that simulations exhibit a slope of 0.248 for $\nu_1$ and of
0.256 for $T_1$, whereas the theory predicts 0.25 in both cases
(because $p = 2$). These results are in good agreement with
Approximation 1, which justifies the fact that we can use
this approximation up to time $T_1$.



Table 1: Coefficients of growth rates of Fig. 2 and 3 in the case $p = 2$

Policy    $\nu_1$   $T_1$     $\nu_2$   $T_2$     $\nu_4$   $T_4$
Min       0.2478    0.2565    0.3765    0.5146    0.3149    0.3287
Random    0.2470    0.2575    0.3711    0.5078    0.2383    0.2530


6.2 Accuracy of Urn and Ball Models 

In this section, we compare the random and deterministic
urn and ball models with $E(T_2)$, the expected value of the
last time when there is an empty server. It clearly appears
in Figure 1 that $T_2$ closely corresponds to the shift in equilibrium
of the system, and this fact has been observed in numerous
simulations. However, as we will see in the following,
Approximation 1 does not hold until time $T_2$, which explains
why it is very challenging from a mathematical point of view
to derive results on $T_2$. (Note in addition that $T_2$ is not a
stopping time.)

Figure 3 shows that $T_1$ is much smaller than $T_2$. This result
is nevertheless not surprising. Indeed, as discussed in
Section 2, results obtained for the random model point out
a local behavior: the first empty urn appears in a region
where many peers still arrive in each interval. Although
many peers should arrive in this time interval, this is in reality
not the case because a very small interval is generated.
Thus, in some sense, the order of magnitude $N^{1/(p+2)}$ provided
by the random urn and ball model is misleading for
the initial system.



In the deterministic model, the sizes of urns are not random, and the stochastic fluctuations arising in the random model do not occur. The deterministic model smooths the local behavior that appears in the random model, and the order of magnitude $(N/\log N)^{1/(p+1)}$ gives more insight into the global situation of the system. When only a few peers arrive in an interval, it really means that the equilibrium begins to shift. One can check in Figure 3 that the theoretical result $T^D$ defined by Equation (20) predicted by the deterministic model is closer to $T_2$ than to $T_1$.

Although considering the deterministic model indeed improves the approximation, $T_2$ still seems much larger than $T^D$. However, thanks to our urn and ball models, we know that the first order approximation of the times $T_i$ is logarithmic, whereas the first order approximation for the indexes $\nu_i$ is polynomial. Table 1 provides useful information to understand the situation.

First, the deterministic model yields a reasonable estimate of the exponent in $\nu_2$: simulations give 0.376 and the deterministic model 0.333. Note that the random model predicts 0.25, so a substantial improvement in accuracy is obtained when using the deterministic model. This suggests that Approximation 1 holds until $T^D$, i.e., up to times of order $N^{1/(p+1)}$.

Second, we observe a significant discrepancy between the exponent for $\nu_2$ and the coefficient of $T_2$: if Approximation 1 were to hold until $T_2$, one would have $T_2 \sim \sum_{k=1}^{\nu_2} E_k/k$, which would yield, because $\nu_2 \approx N^{0.38}$, that $T_2 \sim 0.38 \log(N)$. However, we find that the time $T_2$ is better approximated by $0.52 \log(N)$, and so Approximation 1 does not hold until time $T_2$. This clearly poses the challenge to derive asymptotic results for $T_2$. Moreover, this triggers another interesting question: for how long does Approximation 1 hold? We give a partial answer to this question by considering the times $T_3$ and $T_4$ in the next section.

6.3 On the Duration of Approximation 1

Throughout this paper, we have tried to estimate the time when the equilibrium of the system begins to shift. As long as Approximation 1 holds, the input rate $i(t)$ of the system is the number of peers that are not active at time $t$, times $p$, while the output rate $o(t)$ is just the number of servers at time $t$ (since the service has mean one). Initially, $i(0) = pN$ and $o(0) = 1$, and $i(\infty) = 0$ and $o(\infty) = N$. To study the time at which the equilibrium of the system begins to shift, it is therefore very natural to consider the first time $T_3$ at which $i(t) < o(t)$, i.e., when the number of servers is greater than $p$ times the number of non-active peers. As shown in the following, considering this time leads to the order of magnitude given by the deterministic model (with less precise asymptotics of course).



For times $t < T_3$, we assume that Approximation 1 holds, so that we can cast $\nu_3$ in terms of our urn and ball problem. Let $Z_N^x$ be the number of balls that fall in the $x$ first intervals:

$$Z_N^x = \sum_{i=1}^{N} 1_{\{B_i < T_x\}}.$$

The index $\nu_3$ then corresponds to

$$\nu_3 = \inf\left\{ x : N - Z_N^x < \frac{x}{p} \right\}.$$

The asymptotic behavior of $E(Z_N^x)$ when $x$ goes to infinity with $N$ is easy to derive:

$$N - E(Z_N^x) = N \sum_{i > x} E(P_i) \sim \alpha N \sum_{i > x} i^{-p-1} \sim \frac{\alpha}{p} N x^{-p}.$$

Therefore $N - E(Z_N^x)$ is of the order of $x$ for $x$ of the order of $N^{1/(p+1)}$, i.e., $\nu_3$ is of order $N^{1/(p+1)}$, which is the same order of magnitude as in the deterministic model. Rigorous mathematical analysis could be done to prove this result, but in our view, considering $T_1$ has one main advantage: Proposition 1 is almost a rigorous justification of Approximation 1. When considering another time, in particular $T_3$, we were not able to provide such a strong justification. And as we have seen in the case of $T_2$, Approximation 1 does not hold for the whole first regime, and a strong justification such as Proposition 1 is therefore very valuable.

Finally, let us give some brief results on $T_4$, the first time when a server empties. Simulations show that $\nu_4$ and $T_4$ behave as before (polynomial and logarithmic growth, respectively). Results in Table 1 show that the slope for $T_4$ is similar to the exponent of $\nu_4$, suggesting that Approximation 1 holds until $T_4$.

In conclusion, Approximation 1 holds at least until $N^{1/(p+1)}$, which corresponds to $T^D$ and $T_3$. However, it does not hold until $T_2$, whereas Figure 1 shows that until $T_2$, the system is still in the first regime. For the particular value $p = 2$, we have $\nu^D \approx A_1 N^{0.33}$ and simulations show that $\nu_2 \approx A_2 N^{0.38}$, and so our approximation by means of an urn and ball problem is not so far from the exponent that we want to capture. Proposition 1 shows that until $\nu^D$, there are only few empty servers: so between $T^D$ and $T_2$, it could happen that there is a fraction of empty servers, and although this fraction is small, it has an impact on the system. Similar phenomena have been observed in Sanghavi et al. [15].

To conclude this section, we discuss a different routing policy. Throughout this paper, we have considered the policy where an incoming peer selects the least loaded server, in terms of number of peers. This policy is compared against the random one, where an incoming peer selects a server uniformly at random among all possible servers.

Simulations show that these policies are very close, as shown in Figures 4 and 5. The only noticeable difference concerns $E(\nu_2)$, cf. Figure 7. However, Table 1 shows that the exponents of $E(\nu_2)$ are very similar in the random and in the minimum policy. One can easily check that they are indeed proportional to one another.

Figure 4: Comparison of Min and Random for $E(T_1)$ when $p = 2$

Figure 5: Comparison of Min and Random for $E(\nu_1)$ when $p = 2$

Table 1 shows that for the first time when a server becomes empty, the policy has a great influence. This is easily understandable: in the Min case, it is much harder for a server to become empty, because least loaded servers are selected by incoming peers.
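The order of magnitude $N^{1/(p+1)}$ obtained for $\nu_3$ in Section 6.3 can be illustrated with a mean-field computation. The sketch below is an illustration under stated assumptions, not the simulation used in the paper: it takes $E(P_i)$ exactly proportional to $i^{-p-1}$ (normalised over a finite number of urns), replaces $Z_N^x$ by its mean, and returns the first $x$ for which the mean number of inactive peers is at most $x/p$. The function name `mean_field_nu3` and the truncation parameter `n_terms` are hypothetical.

```python
import math


def mean_field_nu3(n_peers: int, p: float, n_terms: int = 200_000) -> int:
    """Smallest x such that the mean number of inactive peers after x
    intervals is at most x / p, when E(P_i) is taken proportional to
    i**(-p-1) and normalised over the first n_terms urns."""
    weights = [i ** (-(p + 1.0)) for i in range(1, n_terms + 1)]
    total = sum(weights)
    tail = total  # probability mass of urns with index > x, updated below
    for x, w in enumerate(weights, start=1):
        tail -= w
        inactive = n_peers * tail / total
        if inactive <= x / p:  # input rate p * inactive <= output rate x
            return x
    raise ValueError("n_terms is too small for this n_peers")


if __name__ == "__main__":
    p = 2.0
    for n_peers in (10**3, 10**6, 10**9):
        nu3 = mean_field_nu3(n_peers, p)
        print(n_peers, nu3, n_peers ** (1.0 / (p + 1)))
```

On a log-log scale, the values returned for $p = 2$ grow with a slope close to $1/3$, in line with the exponent $1/(p+1)$ of the deterministic model.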

7. CONCLUSION 

The simulations moreover underlined the existence of a sec- 
ond regime during which although the fraction of idle servers 
is small, the output rate is no longer as high as possible. 
This second regime is then followed by a third regime during 
which the capacity offered by the system exceeds by far the 
input rate, and so the system mainly creates empty servers. 
Our urn and ball approach can no longer be applied to these 
two regimes, and so they will be studied in the near future 
using other probabilistic techniques. 






A possible extension of our results consists of incorporating
the possibility for a peer to leave the system right after
completing its download. In terms of urns and balls, this just
amounts to changing the parameter that defines the length of
the $n$-th interval: instead of $n$, one would just have to consider
$pn$ if $p$ is the probability for a peer to become a server
after completing its download. An extended model where
the file is split into different chunks essentially amounts to
studying a multi-class queueing network with a random number
of servers of different classes, which proves to be a much
more difficult problem.

Finally, a natural extension is to consider a general service
distribution, instead of the exponential one. In this case,
the process of creation of servers can be described as an
age-dependent branching process, and more precisely a binary
Bellman-Harris branching process; see Athreya [2, 3]. Although
this setting significantly complicates the analysis of the file
sharing system, it seems that most of the results obtained
in the exponential case should still hold.